Abstract

How low can the joint entropy of n d-wise independent (for d ≥ 2) discrete random
variables be, subject to given constraints on the individual distributions (say, no value
may be taken by a variable with probability greater than p, for p < 1)? This question
has been posed and partially answered in a recent work of Babai [Bab13].

In this paper we improve some of his bounds, prove new bounds in a wider range of 
parameters and show matching upper bounds in some special cases. In particular, we 
prove tight lower bounds for the min-entropy (as well as the entropy) of pairwise and 
three-wise independent balanced binary variables for infinitely many values of n.

1 Introduction

Suitable choice of a (discrete) distribution is a crucial component that underlies many results
in extremal combinatorics and theoretical computer science (e.g., see [AS08]). It is often
the case that the “ideal” distribution to use would be mutually independent over n random
variables $X_1, \ldots, X_n$ (each variable taking one of several possible values); however, “full”
mutual independence is “too expensive” and a d-wise-independent distribution is used instead
(e.g., see [LW06]). (A string of random variables $X_1, \ldots, X_n$ is called d-wise independent if
any d-tuple of the variables is independent.) Indeed, if all variables are independent, then
the sample space has at least exponential size, while d-wise independent spaces can be of
polynomial size if d is constant. This has many applications in computer science. The size of
the space, the number of random bits needed and the joint entropy of $X_1, \ldots, X_n$ are closely
related parameters that are crucial in these applications.

This motivates the question studied in a recent article of Babai [Bab13]: what
is the minimum entropy of n pairwise independent variables? Babai showed an asymptotically
logarithmic lower bound, by proving a very nice theorem. He proved that for any
string $X_1, \ldots, X_n$ of pairwise independent binary-valued variables, where the probabilities
are bounded away from zero and one, there exists a logarithmic-size subset of these variables
that is almost independent. Such a subset must have entropy asymptotically equal to its size,
so the logarithmic lower bound follows.

Our aim in this paper is to answer some questions of Babai and to improve his bounds. For
proving tight bounds, a more traditional approach (see, for example, [Lan65]) based on a
construction of orthogonal matrices seems more suitable. This approach enables us, first, to
extend Babai’s bounds to a larger range of parameters, and second, to obtain more precise
bounds. In particular, we prove that the joint entropy of $X_1, \ldots, X_n$ is logarithmic even
if the entropy of the variables is only of the order of $\log n / n$, which is the lowest possible.
Furthermore, we prove a lower bound $\log(n + 1)$, conjectured by Babai, on the min-entropy of
pairwise independent balanced binary variables (i.e., when each $X_j$ is equal to 0, respectively
1, with probability 1/2). This matches the upper bound given by the well-known construction
based on Hadamard matrices. So the bound is tight if an Hadamard matrix of dimension
n + 1 exists.

Lower bounds on the entropy of d-wise independent variables can be obtained from lower
bounds on pairwise independent variables by a well-known construction that produces a
longer string of pairwise independent variables. We slightly modify this construction for odd
values of d, which enables us to obtain matching upper and lower bounds for d = 3 and
infinitely many values of n (powers of 2).

Although we are primarily interested in binary-valued variables, we will show that some of 
our lower bounds can be extended to the case of general (finite-outcome) pairwise independent 
variables. 

2 Preliminaries 

We will write

\[ H[X] = \sum_{x} \Pr[X = x] \cdot \log \frac{1}{\Pr[X = x]} \]

to denote the Shannon entropy of the (discrete) random variable X, and

\[ H_{\min}[X] = \min_{x} \log \frac{1}{\Pr[X = x]} \]

for the min-entropy. Clearly, $H_{\min}[X] \le H[X]$. All logarithms are to the base 2. Random
variables $X_1, \ldots, X_n$ are said to be d-wise independent if for every $s \in \binom{[n]}{d}$ the variables
$(X_j)_{j \in s}$ are mutually independent. A random variable is called binary if it is supported on
$\{0, 1\}$.
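
The definitions above can be made concrete on a small explicit distribution. The following is a minimal sketch, not part of the original paper: the helper names (`shannon_entropy`, `min_entropy`, `is_d_wise_independent`) and the XOR example are illustrative assumptions. A joint distribution is represented as a dictionary from outcome tuples to probabilities.

```python
import math
from itertools import combinations, product


def shannon_entropy(dist):
    """H[X] in bits; dist maps outcomes to probabilities."""
    return sum(p * math.log2(1.0 / p) for p in dist.values() if p > 0)


def min_entropy(dist):
    """H_min[X] in bits, i.e. log(1 / max_x Pr[X = x])."""
    return math.log2(1.0 / max(dist.values()))


def is_d_wise_independent(dist, d, tol=1e-12):
    """Check that every d coordinates of the joint distribution are independent.

    dist maps equal-length tuples (one entry per variable) to probabilities.
    """
    n = len(next(iter(dist)))
    # Marginal distribution of each coordinate.
    marginals = [{} for _ in range(n)]
    for a, p in dist.items():
        for j, v in enumerate(a):
            marginals[j][v] = marginals[j].get(v, 0.0) + p
    for s in combinations(range(n), d):
        # Joint distribution of the coordinates in s.
        joint = {}
        for a, p in dist.items():
            key = tuple(a[j] for j in s)
            joint[key] = joint.get(key, 0.0) + p
        # Compare with the product of the marginals on every value pattern.
        for key in product(*(marginals[j].keys() for j in s)):
            expected = math.prod(marginals[j][v] for j, v in zip(s, key))
            if abs(joint.get(key, 0.0) - expected) > tol:
                return False
    return True


# Illustrative example (an assumption of this sketch): X1, X2 are uniform
# independent bits and X3 = X1 xor X2.
xor_dist = {(a, b, a ^ b): 0.25 for a in (0, 1) for b in (0, 1)}
```

On this example the three variables are pairwise independent but not 3-wise independent, and both entropies equal 2 = log(n + 1) for n = 3.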

Recall Cantelli’s inequality [Can10], a strengthening of Chebyshev’s inequality for the
case of one-sided deviations:

Lemma 2.1 (Cantelli’s inequality). For every random variable X and real t > 0,

\[ \Pr[X \le E[X] - t],\ \Pr[X \ge E[X] + t] \;\le\; \frac{1}{1 + t^2 / \mathrm{Var}[X]}. \tag{1} \]
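
As a quick sanity check of Lemma 2.1, the sketch below compares the lower tail of a small distribution with the right-hand side of Cantelli's inequality. The function names and the Binomial(4, 1/2) example are assumptions made for illustration only.

```python
import math


def cantelli_rhs(variance, t):
    """Right-hand side 1 / (1 + t^2 / Var[X]), written so that Var[X] = 0 is allowed."""
    return variance / (variance + t * t)


def mean_and_variance(dist):
    """Mean and variance of a real-valued distribution {value: probability}."""
    mean = sum(x * p for x, p in dist.items())
    var = sum(p * (x - mean) ** 2 for x, p in dist.items())
    return mean, var


def lower_tail(dist, t):
    """Pr[X <= E[X] - t]."""
    mean, _ = mean_and_variance(dist)
    return sum(p for x, p in dist.items() if x <= mean - t)


# Illustrative example: X ~ Binomial(4, 1/2), so E[X] = 2 and Var[X] = 1.
binom = {k: math.comb(4, k) / 16 for k in range(5)}
```

For instance, with t = 1 the lower tail is Pr[X ≤ 1] = 5/16, which is below the Cantelli bound 1/2.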

3 Lower bounds 

In this section we give lower bounds on the joint entropy of n d-wise-independent variables.

3.1 Pairwise independent binary variables

Here we give two incomparable entropy lower bounds for the families of pairwise independent 
binary variables. 


Theorem 3.1. Let $X = (X_1, \ldots, X_n)$ be n pairwise independent binary variables. Let
$q_j = \Pr[X_j = 1]$ and suppose that $0 < q_j < 1$ for $j = 1, \ldots, n$. Then

\[ H[X] \;\ge\; \sup_{0 < t < n} \frac{\log(n + 1 - t)}{1 + \frac{1}{t^2} \sum_{j=1}^{n} \frac{(1 - 2q_j)^2}{q_j (1 - q_j)}}. \]


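Proof. Let $A = \{a_1, \ldots, a_m\} \subseteq \{0,1\}^n$ be the support of X and $p_i = \Pr[X = a_i]$. We will
denote by $a_{ij}$ the j’th element of $a_i$ for $j \in [n]$.

Define an $m \times (n + 1)$ matrix $U = \{u_{ij}\}$ as follows. For all $i \in [m]$,

\[ u_{i0} \stackrel{\mathrm{def}}{=} \sqrt{p_i}, \]

and for $j \in [n]$,

\[ u_{ij} \stackrel{\mathrm{def}}{=} \begin{cases} -\sqrt{p_i \cdot \frac{q_j}{1 - q_j}} & \text{if } a_{ij} = 0, \\ \sqrt{p_i \cdot \frac{1 - q_j}{q_j}} & \text{if } a_{ij} = 1. \end{cases} \]

For $0 \le j \le n$, let $u_j$ denote the j’th column vector of U; note that these vectors form
an orthonormal family: For $j > 0$,

\[ \langle u_0, u_j \rangle = \sqrt{\frac{1 - q_j}{q_j}} \cdot \Pr[X_j = 1] - \sqrt{\frac{q_j}{1 - q_j}} \cdot \Pr[X_j = 0] = 0; \]

and for $k \ne j$, both positive,

\[ \begin{aligned} \langle u_k, u_j \rangle ={}& \sqrt{\frac{1 - q_k}{q_k}} \left( \sqrt{\frac{1 - q_j}{q_j}} \cdot \Pr[X_k = 1 \wedge X_j = 1] - \sqrt{\frac{q_j}{1 - q_j}} \cdot \Pr[X_k = 1 \wedge X_j = 0] \right) \\ &- \sqrt{\frac{q_k}{1 - q_k}} \left( \sqrt{\frac{1 - q_j}{q_j}} \cdot \Pr[X_k = 0 \wedge X_j = 1] - \sqrt{\frac{q_j}{1 - q_j}} \cdot \Pr[X_k = 0 \wedge X_j = 0] \right) = 0, \end{aligned} \]

as follows from independence of $X_k$ and $X_j$. As well, the norm of every $u_j$ is 1.

Since the matrix U is unitary, or can be made unitary by adding more columns, we know
that the norm of each row of U is at most 1. Thus we get, for all $i \in [m]$,

\[ 1 \;\ge\; \sum_{j=0}^{n} u_{ij}^2 \;=\; p_i \cdot \left( 1 + \sum_{j :\, a_{ij} = 0} \frac{q_j}{1 - q_j} + \sum_{j :\, a_{ij} = 1} \frac{1 - q_j}{q_j} \right). \tag{2} \]

Our aim is to find a subset $A_0 \subseteq A$ such that every string $a \in A_0$ has a low probability,
whereas the weight of $A_0$ is large. This will give us the lower bound on the entropy.

Let $A_0$ be all the elements $a_i$ of A satisfying

\[ 1 + \sum_{j=1}^{n} \left( (1 - a_{ij}) \frac{q_j}{1 - q_j} + a_{ij} \frac{1 - q_j}{q_j} \right) > n + 1 - t. \tag{3} \]

W.l.o.g. we may assume that $A_0 = \{a_1, \ldots, a_{m_0}\}$. Then, according to (2), for every $i = 1, \ldots, m_0$,

\[ p_i < 1/(n + 1 - t). \tag{4} \]

Let Y be the random variable defined by

\[ Y := 1 + \sum_{j=1}^{n} \left( (1 - X_j) \frac{q_j}{1 - q_j} + X_j \frac{1 - q_j}{q_j} \right). \]

Then we have

\[ \sum_{i=1}^{m_0} p_i = \Pr[Y > n + 1 - t]. \]

The expectation of Y is

\[ E[Y] = 1 + \sum_{j=1}^{n} \left( (1 - q_j) \frac{q_j}{1 - q_j} + q_j \frac{1 - q_j}{q_j} \right) = n + 1. \]

The variance of Y is

\[ \mathrm{Var}[Y] = \mathrm{Var}\left[ \sum_{j=1}^{n} \frac{1 - 2q_j}{q_j (1 - q_j)} \cdot X_j \right] = \sum_{j=1}^{n} \left( \frac{1 - 2q_j}{q_j (1 - q_j)} \right)^2 \mathrm{Var}[X_j] = \sum_{j=1}^{n} \frac{(1 - 2q_j)^2}{q_j (1 - q_j)}, \]

where we have used the fact that the variables are pairwise independent. Now we apply
Cantelli’s inequality (Lemma 2.1) to the random variable Y and parameter t:

\[ \sum_{i=1}^{m_0} p_i = \Pr[Y > n + 1 - t] = \Pr[Y > E[Y] - t] \;\ge\; \frac{1}{1 + \frac{1}{t^2} \mathrm{Var}[Y]}. \]

Using this inequality and the fact that $p_i^{-1} > n + 1 - t$ for all $i \in [m_0]$ (which is (4)), we get

\[ H[X] = \sum_{i=1}^{m} p_i \log \frac{1}{p_i} \;\ge\; \sum_{i=1}^{m_0} p_i \log \frac{1}{p_i} \;\ge\; \log(n + 1 - t) \sum_{i=1}^{m_0} p_i \;\ge\; \frac{\log(n + 1 - t)}{1 + \frac{1}{t^2} \sum_{j=1}^{n} \frac{(1 - 2q_j)^2}{q_j (1 - q_j)}}, \]

as required.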
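
The orthonormality of the columns of U can be checked numerically on a small biased example. The sketch below is illustrative only: `build_U`, `gram` and `biased_example` are hypothetical helper names, and the example distribution (indicators of four pairwise linearly independent linear forms over GF(3) vanishing, so every $q_j = 1/3$) is an assumption chosen for this sketch, not a construction from the paper.

```python
import math


def build_U(dist):
    """Matrix U from the proof of Theorem 3.1; rows are indexed by the support of X."""
    support = sorted(dist)
    n = len(support[0])
    q = [sum(p for a, p in dist.items() if a[j] == 1) for j in range(n)]
    U = []
    for a in support:
        p = dist[a]
        row = [math.sqrt(p)]  # column 0
        for j in range(n):
            if a[j] == 1:
                row.append(math.sqrt(p * (1 - q[j]) / q[j]))
            else:
                row.append(-math.sqrt(p * q[j] / (1 - q[j])))
        U.append(row)
    return U


def gram(U):
    """Gram matrix of the columns of U."""
    cols = list(zip(*U))
    return [[sum(x * y for x, y in zip(c, d)) for d in cols] for c in cols]


def biased_example():
    """Four pairwise independent binary variables with Pr[X_j = 1] = 1/3.

    (Z1, Z2) is uniform on GF(3)^2 and X_j indicates that a linear form vanishes;
    any two of the four forms are linearly independent, hence pairwise independent.
    """
    dist = {}
    for z1 in range(3):
        for z2 in range(3):
            a = (int(z1 == 0), int((z1 + z2) % 3 == 0),
                 int((z1 + 2 * z2) % 3 == 0), int(z2 == 0))
            dist[a] = dist.get(a, 0.0) + 1 / 9
    return dist
```

In this example the support has five points and n + 1 = 5, so U is square, its Gram matrix is the identity, and every row has norm exactly 1.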
Suppose that $0 < q \le q_j \le 1/2$ for some q and all j. Since $\frac{1}{q} \ge \frac{(1 - 2q_j)^2}{q_j (1 - q_j)}$ for every j, we get

\[ H[X] \;\ge\; \sup_{0 < t < n} \frac{\log(n + 1 - t)}{1 + \frac{n}{t^2 q}}. \tag{5} \]

In particular, for t = n/2,

\[ H[X] \;\ge\; \frac{\log(n/2 + 1)}{1 + \frac{4}{nq}}. \]

This proves that the entropy of X is $\Omega(\log n)$ as long as $q_j \ge \varepsilon/n$, $j = 1, \ldots, n$, for some
$\varepsilon > 0$, i.e., if $H[X_j] = \Omega(\log n / n)$. On the other hand, if $q_j \le q(n)$, $j = 1, \ldots, n$, for some
$q(n) = o(n^{-1})$, then $H[X_j] = o(n^{-1} \log n)$, and thus $H[X] = o(\log n)$.
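
To get a feeling for these bounds, one can evaluate the supremum in Theorem 3.1 on a grid of values of t and compare it with the closed form obtained at t = n/2. This is only a numerical sketch: the function names and the grid resolution are assumptions of this example, and a grid search gives a lower estimate of the supremum rather than its exact value.

```python
import math


def theorem_3_1_bound(q, steps=10_000):
    """Grid estimate of sup_{0<t<n} log(n+1-t) / (1 + (1/t^2) * sum_j (1-2q_j)^2 / (q_j (1-q_j)))."""
    n = len(q)
    s = sum((1 - 2 * qj) ** 2 / (qj * (1 - qj)) for qj in q)
    best = 0.0
    for k in range(1, steps):
        t = n * k / steps  # t ranges over (0, n)
        best = max(best, math.log2(n + 1 - t) / (1 + s / (t * t)))
    return best


def bound_5_at_half(n, q):
    """The special case t = n/2 of bound (5): log(n/2 + 1) / (1 + 4 / (n q))."""
    return math.log2(n / 2 + 1) / (1 + 4 / (n * q))
```

For example, with n = 100 and all $q_j = 1/100$ the grid estimate is at least `bound_5_at_half(100, 0.01)`, and for unbiased variables the estimate approaches log(n + 1) as the grid gets finer near t = 0.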

If all $q_j = 1/2$ we get $H[X] \ge \log(n + 1)$ by taking $t \to 0$. This is tight for infinitely
many values of n (see Section 4) and confirms Conjecture 1.2 of Babai [Bab13]. However,
the following theorem implies the same bound even for the min-entropy and the proof is, in
fact, more direct.

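Theorem 3.2. Let $X = (X_1, \ldots, X_n)$ be n pairwise independent binary variables. Let
$q_j = \Pr[X_j = 1]$ and suppose that $0 < q_j < 1$ for $j = 1, \ldots, n$. Then

\[ H_{\min}[X] \;\ge\; \log \left( 1 + \sum_{j=1}^{n} \min \left\{ \frac{1 - q_j}{q_j}, \frac{q_j}{1 - q_j} \right\} \right). \]

Proof. Let U be an $m \times (n + 1)$ matrix as in the proof of Theorem 3.1, assuming again w.l.o.g.
that $\Pr[X_j = 1] \le \Pr[X_j = 0]$ always. From (2) we get that for all $i \in [m]$,

\[ 1 \;\ge\; p_i \cdot \left( 1 + \sum_{j :\, a_{ij} = 0} \frac{q_j}{1 - q_j} + \sum_{j :\, a_{ij} = 1} \frac{1 - q_j}{q_j} \right) \;\ge\; p_i \cdot \left( 1 + \sum_{j=1}^{n} \min \left\{ \frac{1 - q_j}{q_j}, \frac{q_j}{1 - q_j} \right\} \right), \]

which gives us the required upper bound on the $p_i$’s (equivalently, the required lower bound on $H_{\min}[X]$).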
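
Corollary 3.3. If all $q_j \ge q$ (with $q_j \le 1/2$, as assumed in the proof above), then

\[ H_{\min}[X] \;\ge\; \log \left( 1 + \frac{nq}{1 - q} \right). \tag{6} \]

For q = 1/2 (unbiased $X_j$’s), this corollary gives

\[ H_{\min}[X] \ge \log(n + 1), \]

which is tight for infinitely many values of n.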
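
The bounds of Theorem 3.2 and Corollary 3.3 are simple closed formulas and can be evaluated directly. The following sketch uses hypothetical function names and is meant only to illustrate the formulas.

```python
import math


def theorem_3_2_bound(q):
    """log(1 + sum_j min{(1-q_j)/q_j, q_j/(1-q_j)}) for probabilities q_j in (0, 1)."""
    return math.log2(1 + sum(min((1 - qj) / qj, qj / (1 - qj)) for qj in q))


def corollary_3_3_bound(n, q):
    """log(1 + n q / (1 - q)), valid when q <= q_j <= 1/2 for all j."""
    return math.log2(1 + n * q / (1 - q))
```

For three unbiased variables the bound is log 4 = 2, which is attained by the distribution of (X1, X2, X1 xor X2) with X1, X2 uniform and independent, whose four support points each have probability 1/4.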


3.2 Pairwise independent finite-outcome variables 

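Let [k] be the set of values that a random variable $X_j$ takes on, $k \ge 2$.

Theorem 3.4. Let $X = (X_1, \ldots, X_n)$ be pairwise independent variables that take on values
in [k], $k \ge 2$. Let w be such that for all $i \in [n]$, $j \in [k]$,

\[ \Pr[X_i = j] \le w \quad (\text{i.e., } H_{\min}[X_i] \ge -\log w). \]

If $w \ge 1/2$, then

\[ H_{\min}[X] \;\ge\; \log \left( \frac{1 - w}{w} \cdot n + 1 \right). \]

If $w \le 1/2$, then

\[ H_{\min}[X] \;\ge\; \log(n + 1). \]

To prove the theorem, we need the following technical statement.

Claim 3.5. For $k \ge 2$, let $b_1, \ldots, b_k \ge 0$ be such that $\sum_{t=2}^{k} b_t \ge b_1$ and, for all $t \ge 2$, $b_t \le b_1$.
Then there exist $\alpha_2, \ldots, \alpha_k \in \mathbb{R}$ such that

\[ \sum_{t=2}^{k} b_t \cdot e^{\mathrm{i} \alpha_t} = b_1, \]

where $\mathrm{i}$ denotes the imaginary unit.

Proof of Claim 3.5. Let $C_r$ denote the circle in the complex plane with radius r and center
in 0. The claim is equivalent to the statement that $b_1$ is in the Minkowski sum of $C_{b_2}, \ldots, C_{b_k}$.
Note that if $r \le s$, then $C_r + C_s$ contains $C_s$ as a subset. Thus the sum $C_{b_2} + \cdots + C_{b_k}$ is either
a region between $C_{b_2 + \cdots + b_k}$ and some smaller circle, or a disc with radius $b_2 + \cdots + b_k$; in
any case, it contains both $C_{\max_{2 \le t \le k} b_t}$ and $C_{b_2 + \cdots + b_k}$. Hence it also contains $b_1$. □ (Claim 3.5)

Proof of Theorem 3.4. The proof is a modification of the proofs of Theorems 3.1 and 3.2.
Let $A = \{a_1, \ldots, a_m\} \subseteq [k]^n$ be the support of X and $p_i \stackrel{\mathrm{def}}{=} \Pr[X = a_i]$. For $j \in [n]$, let
$w_j \stackrel{\mathrm{def}}{=} \max \{\Pr[X_j = t] \mid t \in [k]\}$ and assume without loss of generality that $\Pr[X_j = 1] = w_j$.
Let $\omega_j \stackrel{\mathrm{def}}{=} \max \{1, \frac{w_j}{1 - w_j}\}$, and let $\alpha_{j,2}, \ldots, \alpha_{j,k}$ be the values guaranteed by Claim 3.5 for
$b_1 = w_j / \omega_j$ and $b_t = \Pr[X_j = t]$ for $2 \le t \le k$ (which satisfy the requirements of the claim).

This time we define the matrix U over $\mathbb{C}$: for $i \in [m]$,

\[ u_{i0} \stackrel{\mathrm{def}}{=} \sqrt{p_i}, \]

and for $j \in [n]$,

\[ u_{ij} \stackrel{\mathrm{def}}{=} \begin{cases} -\sqrt{p_i / \omega_j} & \text{if } a_{ij} = 1, \\ \sqrt{p_i\, \omega_j} \cdot e^{\mathrm{i} \alpha_{j,z}} & \text{if } a_{ij} = z > 1. \end{cases} \]

As before, let $u_j$ denote the j’th column vector of U. Then, by the immediate adaptation
of the argument we gave for Theorem 3.1 (taking into account the guarantees of Claim 3.5),
it holds that for all $j \ne k \ge 0$,

\[ \langle u_j, u_k \rangle = 0 \quad \text{and} \quad \|u_j\| = 1. \]

Therefore, the norm of each row of U is at most 1 and, for every i,

\[ 1 \;\ge\; p_i \cdot \left( 1 + \sum_{j :\, a_{ij} = 1} \frac{1}{\omega_j} + \sum_{j :\, a_{ij} > 1} \omega_j \right) \;\ge\; p_i \cdot \left( 1 + \frac{n}{\omega_{\max}} \right), \]

where $\omega_{\max} \stackrel{\mathrm{def}}{=} \max \{\omega_j \mid j \in [n]\}$. The result follows.

□ (Theorem 3.4)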
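
Claim 3.5 is proved above by a Minkowski-sum argument, which does not directly produce the angles. The sketch below computes one admissible choice explicitly, reading the claim as $\sum_{t \ge 2} b_t e^{\mathrm{i}\alpha_t} = b_1$. It uses a different and purely illustrative technique (splitting $b_2, \ldots, b_k$ into two groups and closing a triangle with sides $b_1$, $S_A$, $S_B$ via the law of cosines); the function names and the tolerance `eps` are assumptions of this example, not part of the paper.

```python
import cmath
import math


def claim_3_5_angles(b1, bs, eps=1e-12):
    """Angles alpha_2..alpha_k with sum_t bs[t] * exp(i * alpha_t) = b1.

    bs = [b_2, ..., b_k]; requires max(bs) <= b1 <= sum(bs).
    """
    total = sum(bs)
    if not (max(bs) <= b1 + eps and b1 <= total + eps):
        raise ValueError("hypotheses of the claim are violated")
    # Move a prefix of bs into group A until |S_A - S_B| <= b1.  Such a prefix
    # exists because S_A - S_B grows from -total in steps of 2*b_t <= 2*b1.
    size_a, sum_a = 0, 0.0
    while size_a < len(bs) and abs(2 * sum_a - total) > b1 + eps:
        sum_a += bs[size_a]
        size_a += 1
    sum_b = total - sum_a

    def clamp(x):
        return max(-1.0, min(1.0, x))

    # Triangle with sides b1, S_A, S_B:  S_A e^{i theta} + S_B e^{-i phi} = b1.
    theta = 0.0
    if sum_a > eps and b1 > eps:
        theta = math.acos(clamp((b1 ** 2 + sum_a ** 2 - sum_b ** 2) / (2 * b1 * sum_a)))
    phi = 0.0
    if sum_b > eps:
        phi = math.acos(clamp((b1 - sum_a * math.cos(theta)) / sum_b))
    return [theta] * size_a + [-phi] * (len(bs) - size_a)


def rotated_sum(bs, angles):
    """sum_t b_t * exp(i * alpha_t) as a complex number."""
    return sum(b * cmath.exp(1j * a) for b, a in zip(bs, angles))
```

For instance, for $b_1 = 0.5$ and $(b_2, \ldots, b_5) = (0.5, 0.3, 0.2, 0.1)$ the first number is rotated by an angle θ with cos θ = 0.28 and the remaining three by −φ with cos φ = 0.6, and the rotated sum equals 0.5 up to rounding.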
3.3 d-wise independent unbiased binary variables

One can use an idea from [ABI86] to derive from the case of pairwise independent variables
stronger lower bounds for d-wise independent variables. We will demonstrate it only on
Theorem 3.2, but the same idea can be used together with other lower bounds on the entropy
of pairwise independent variables.

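Theorem 3.6. Let $X = (X_1, \ldots, X_n)$ where $X_j$’s are d-wise independent unbiased binary
variables. If d is even, then

\[ H_{\min}[X] \;\ge\; \log \left( \sum_{i=0}^{d/2} \binom{n}{i} \right). \]

If d is odd, then

\[ H_{\min}[X] \;\ge\; \log \left( \sum_{i=0}^{(d-1)/2} \binom{n}{i} + \binom{n - 1}{(d - 1)/2} \right). \]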
Proof. Let d be even. We define $Y = (Y_1, \ldots, Y_m)$, where all $Y_i$’s are unbiased binary variables
equal to the parity of at least one and at most d/2 variables $X_j$ and $m = \sum_{i=1}^{d/2} \binom{n}{i}$ (every $Y_i$ is unique).
Clearly, $Y_1, \ldots, Y_m$ are pairwise independent (the sum of two distinct $Y_i$’s is the parity of a
nonempty set of at most d variables $X_j$, hence unbiased), and from Theorem 3.2 we get

\[ H_{\min}[X] \;\ge\; H_{\min}[Y] \;\ge\; \log \left( 1 + \sum_{i=1}^{d/2} \binom{n}{i} \right) = \log \left( \sum_{i=0}^{d/2} \binom{n}{i} \right). \]

If d is odd, we take the parities of at most (d − 1)/2 variables $X_j$ and the parities of $X_1$
with exactly (d − 1)/2 other variables. The resulting variables are again pairwise independent. □

In the next section we will see, in particular, that the above bound is tight for the case
of d = 3 and n being a power of 2.
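
The counting in Theorem 3.6 and the parity construction from its proof can be illustrated as follows. The sketch is an assumption-laden example: the function names are hypothetical, variables are indexed from 0 (so index 0 plays the role of $X_1$), and pairwise independence of the parities is checked only on the uniform distribution over $\{0,1\}^n$, which is one particular d-wise independent distribution.

```python
import math
from itertools import combinations, product


def theorem_3_6_bound(n, d):
    """Lower bound of Theorem 3.6 on H_min[X] for d-wise independent unbiased bits."""
    half = d // 2  # d/2 for even d, (d-1)/2 for odd d
    count = sum(math.comb(n, i) for i in range(half + 1))
    if d % 2 == 1:
        count += math.comb(n - 1, half)
    return math.log2(count)


def parity_index_sets(n, d):
    """Index sets whose parities form the variables Y in the proof."""
    half = d // 2
    sets = [frozenset(s) for r in range(1, half + 1)
            for s in combinations(range(n), r)]
    if d % 2 == 1:
        # Variable 0 together with exactly (d-1)/2 other variables.
        sets += [frozenset((0,) + s) for s in combinations(range(1, n), half)]
    return sets


def parities_pairwise_independent(n, d):
    """Check pairwise independence of the parities under the uniform distribution."""
    points = list(product((0, 1), repeat=n))
    for s, t in combinations(parity_index_sets(n, d), 2):
        counts = {}
        for x in points:
            key = (sum(x[j] for j in s) % 2, sum(x[j] for j in t) % 2)
            counts[key] = counts.get(key, 0) + 1
        # Unbiased and independent: all four patterns are equally frequent.
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True
```

For n = 8 and d = 3 there are 8 + 7 = 15 parity variables, so the bound is log 16 = 4 = log 2n.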

4 Upper bounds 

In this section we review some constructions of d-wise independent unbiased binary variables
with low entropies. The constructions are based on known ideas, and they are included here
to argue optimality of the lower bounds from Section 3.

The standard way of constructing d-wise independent distributions is using parity-check
matrices of codes with minimum distance > d. In such matrices every d columns are linearly
independent. Hence, if we take the space of vectors generated by the rows of such a matrix,
i.e., the dual code, we obtain d-wise independent variables. Over GF(2), these are balanced
binary variables. To get matching bounds we have to find suitable codes.

We start with the case of pairwise independent variables (d = 2). Recall that an
Hadamard matrix is a real matrix with entries ±1 whose rows (and hence also columns)
are orthogonal. Hadamard matrices exist for infinitely many dimensions, in particular for
every power of 2. Given an Hadamard matrix of dimension n + 1, first transform it into
an Hadamard matrix with the first column having all 1s, then delete the first column. The
resulting (n + 1) × n matrix defines in a natural way an Hadamard code and n pairwise
independent balanced binary variables supported on a set of size n + 1.
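
The construction just described can be sketched in a few lines for dimensions that are powers of 2. The Sylvester recursion used here is one standard way to obtain such Hadamard matrices; the function names, and the convention of mapping +1 to 0 and −1 to 1, are assumptions of this illustration.

```python
from itertools import combinations


def sylvester_hadamard(k):
    """Hadamard matrix of dimension 2^k via the recursion H -> [[H, H], [H, -H]]."""
    H = [[1]]
    for _ in range(k):
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H


def hadamard_sample_space(H):
    """Rows of H as n = dim - 1 binary variables on dim equiprobable points."""
    # Normalise so that the first column is all ones, then delete it.
    normalised = [[x * row[0] for x in row] for row in H]
    return [tuple((1 - x) // 2 for x in row[1:]) for row in normalised]


def pairwise_independent_uniform(points):
    """Pairwise independence and balance of the coordinates on equiprobable points."""
    n = len(points[0])
    for j, k in combinations(range(n), 2):
        counts = {}
        for a in points:
            counts[(a[j], a[k])] = counts.get((a[j], a[k]), 0) + 1
        if len(counts) != 4 or len(set(counts.values())) != 1:
            return False
    return True
```

With k = 3 this yields n = 7 pairwise independent balanced bits on 8 equiprobable points, so the min-entropy is log 8 = log(n + 1).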

Lancaster [Lan65] proved:

1. For every n ≥ 2, there exist at most n pairwise independent random variables on a
probability space with n + 1 points.

2. The existence of such random variables where, additionally, each point in the probability
space has measure $\frac{1}{n+1}$ is equivalent to the existence of an Hadamard matrix of
dimension n + 1.

Our proofs of Theorems 3.1 and 3.2 can be viewed as an extension of an argument used by
Lancaster to prove 2. Lancaster considered general (finite-outcome) pairwise independent
variables. For unbiased binary variables, we can prove the following.

Theorem 4.1. The existence of n pairwise independent unbiased binary variables with
entropy equal to $\log(n + 1)$ is equivalent to the existence of an Hadamard matrix of dimension
n + 1.

Proof. As shown above, an Hadamard matrix of dimension n + 1 gives rise to n pairwise
independent unbiased binary variables with entropy equal to $\log(n + 1)$.

To prove the converse, let n pairwise independent unbiased binary variables with entropy
equal to $\log(n + 1)$ be given. According to Theorem 3.2, every point in the probability space
has measure at most $\frac{1}{n+1}$. Since the entropy is $\log(n + 1)$, this implies that there are exactly
n + 1 points, each with measure $\frac{1}{n+1}$. The existence of an Hadamard matrix of dimension
n + 1 then follows from Lancaster’s theorem, or from our proof of Theorem 3.1. □

Another case where we can precisely match the lower bound for infinitely many values of
n is d = 3. Let $n = 2^l$ and consider the $(l + 1) \times n$ binary matrix whose first row consists of
1’s and the columns restricted to the remaining l rows are all vectors of length l. Every two
different columns are linearly independent over GF(2) because they are different (and nonzero).
Every three different columns are also independent because every two of them are and they
cannot sum to the zero vector due to the first row. Hence the space generated by the rows is
3-wise independent. The size of the space is $2^{l+1} = 2n$, precisely matching the statement of
Theorem 3.6. Thus we have:

Theorem 4.2. If n is a power of 2, then the minimum of $H_{\min}[X]$ taken over all n-tuples
of 3-wise independent unbiased binary variables is $\log 2n$.
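
The 3-wise independent space described above is small enough to enumerate. The following sketch builds the row space of the $(l + 1) \times n$ matrix over GF(2) and checks 3-wise independence by counting; the function names are assumptions made for this illustration.

```python
from itertools import combinations, product


def three_wise_space(l):
    """Row space over GF(2) of the (l+1) x 2^l matrix with an all-ones first row
    and all binary vectors of length l as the remaining parts of the columns."""
    n = 2 ** l
    columns = list(product((0, 1), repeat=l))
    rows = [[1] * n] + [[col[r] for col in columns] for r in range(l)]
    space = set()
    for coeffs in product((0, 1), repeat=l + 1):
        space.add(tuple(sum(c * row[j] for c, row in zip(coeffs, rows)) % 2
                        for j in range(n)))
    return sorted(space)


def is_d_wise_uniform(points, d):
    """Every d coordinates are jointly uniform on {0,1}^d under the uniform
    distribution on the given points (i.e. d-wise independent and unbiased)."""
    n = len(points[0])
    for idx in combinations(range(n), d):
        counts = {}
        for a in points:
            key = tuple(a[j] for j in idx)
            counts[key] = counts.get(key, 0) + 1
        if len(counts) != 2 ** d or len(set(counts.values())) != 1:
            return False
    return True
```

For l = 3 the space has 16 = 2n points for n = 8 variables; it is 3-wise but not 4-wise independent, since some four columns of the matrix sum to zero.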

Note that the above construction is based on the parity-check matrix of the Hamming
code: first we extend the matrix by a column with all zeros and then we extend it by a row
with all ones. The two constructions, one based on the Hadamard code and the other based
on the Hamming code, can be generalized using BCH codes. Recall that the binary BCH
code of length $2^m - 1$ and designed distance 2t + 1 has the minimal distance at least 2t + 1
and dimension $2^m - 1 - mt$, provided that m is sufficiently large with respect to t (see [MS83],
pages 258 and 253). Hence every 2t columns of the parity-check matrix (and also of the dual
code) are linearly independent and the space generated by the parity-check
matrix (i.e., the dual code) has dimension mt. Thus for d ≥ 2 even, we can take a BCH code
with designed distance 2t + 1 = d + 1 and we get $n = 2^m - 1$ d-wise independent random
variables with min-entropy

\[ \frac{d}{2} \log(n + 1). \]

For d ≥ 3 odd, we take 2t + 1 = d, and extend the parity-check matrix by a column
of zeros and a row of ones, as we did above. Thus we obtain a matrix with every d columns
independent. Let $n = 2^m$ be the number of columns of this matrix. The linear space
generated by the rows gives a probability space of n d-wise independent random variables
with min-entropy

\[ \frac{d - 1}{2} \log n + 1. \]

These bounds are asymptotically equal to the lower bounds of Theorem 3.6 when n goes to
infinity. However, we have not been able to find constructions matching our lower bound
exactly for any d ≥ 4 and any n.
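
The gap between these upper bounds and the lower bounds of Theorem 3.6 can be tabulated numerically. The sketch below only evaluates the two formulas; it does not construct BCH codes, and it takes for granted the dimension formula quoted above (which requires m to be sufficiently large with respect to t). The function names are assumptions of this example.

```python
import math


def lower_bound_3_6(n, d):
    """Lower bound of Theorem 3.6 on the min-entropy."""
    half = d // 2
    count = sum(math.comb(n, i) for i in range(half + 1))
    if d % 2 == 1:
        count += math.comb(n - 1, half)
    return math.log2(count)


def bch_upper_bound(m, d):
    """(n, min-entropy) of the BCH-based space described in the text.

    Even d: n = 2^m - 1 and min-entropy (d/2) * log(n + 1) = (d/2) * m.
    Odd d:  n = 2^m     and min-entropy ((d-1)/2) * log n + 1.
    """
    if d % 2 == 0:
        return 2 ** m - 1, (d // 2) * m
    return 2 ** m, ((d - 1) // 2) * m + 1
```

For d = 2 and d = 3 the two quantities coincide (for example, both equal 5 for d = 3 and n = 16), while for d = 4 and m = 5 the construction gives 10 against a lower bound of log 497, roughly 8.96.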

5 Conclusions 

We proved several lower bounds on the entropy of pairwise and d-wise independent random
variables. Our lower bounds match upper bounds exactly, or asymptotically, for some special
values of the parameters involved. But for most values of the parameters, we do not know even
the asymptotic behavior of the dependence of entropy on them. This is, in particular, so in
the case of equally distributed pairwise independent 0-1 variables. In this special case we
have two bounds, (5) and (6), which give an asymptotically optimal bound for $q \approx 1/n$ and a
tight bound for q = 1/2, but for other values we do not know. Another interesting problem,
studied in [Bab13], is to find the best lower bound on the joint entropy $H[X_1, \ldots, X_n]$ of a
string of pairwise independent random variables $X_1, \ldots, X_n$ in terms of the parameter
$L := \sum_j H[X_j]$. For more open problems, see [Bab13].